A Tool for Literary Studies: Intertextual Distance and Tree Classification

نویسندگان

  • Cyril Labbé
  • Dominique Labbé
چکیده

How to measure proximities and oppositions in large text corpora? Intertextual distance provides a simple and interesting solution. Its properties make it a good tool for text classification, and especially for tree-analysis which is fully presented and discussed here. In order to measure the quality of this classification, two indices are proposed. The method presented provides an accurate tool for literary studies -as is demonstrated by applying it to two areas of French literature, Racine's tragedies and an authorship attribution experiment. Résumé Comment mesurer les proximités et les oppositions dans les grands corpus de texts ? La distance intertextuelle offre une solution simple et intéressante. Ses propriétés en font un excellent outil pour la classification des texts, spécialement la classification arborée qui est présentée et discutée de manière exhaustive. Deux indices sont proposés pour contrôler la qualité de cette classification. Cette méthode présente un outil très utile pour les études littéraires comme le montrent deux applications dans deux domaines : les tragédies de Racine et une expérience d'attribution d'auteur. lexical statistics ; intertextual distance ; clustering ; tree-analysis ; French literature ; Racine ; authorship attribution

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Experiments on authorship attribution by intertextual distance in English

How can it be said that texts are "near" or "distant" from one another? Are different texts by a single author more similar than texts by different authors? To answer these questions, a method is proposed by combination of the calculus of intertextual distance with automatic clustering and tree-classification. A blind test and some additional experiments show that this method offers an interest...

متن کامل

Fault Detection and Classification in Double-Circuit Transmission Line in Presence of TCSC Using Hybrid Intelligent Method

In this paper, an effective method for fault detection and classification in a double-circuit transmission line compensated with TCSC is proposed. The mutual coupling of parallel transmission lines and presence of TCSC affect the frequency content of the input signal of a distance relay and hence fault detection and fault classification face some challenges. One of the most effective methods fo...

متن کامل

Comparison of Performance in Image Classification Algorithms of Satellite in Detection of Sarakhs Sandy zones

Extended abstract 1- Introduction Wind erosion as an “environmental threat” has caused serious problems in the world. Identifying and evaluating areas affected by wind erosion can be an important tool for managers and planners in the sustainable development of different areas.  nowadays there are various methods in the world for zoning lands affected by wind erosion. One of the most important...

متن کامل

Steel Buildings Damage Classification by damage spectrum and Decision Tree Algorithm

Results of damage prediction in buildings can be used as a useful tool for managing and decreasing seismic risk of earthquakes. In this study, damage spectrum and C4.5 decision tree algorithm were utilized for damage prediction in steel buildings during earthquakes. In order to prepare the damage spectrum, steel buildings were modeled as a single-degree-of-freedom (SDOF) system and time-history...

متن کامل

Comparison of Machine Learning Algorithms for Broad Leaf Species Classification Using UAV-RGB Images

Abstract: Knowing the tree species combination of forests provides valuable information for studying the forest’s economic value, fire risk assessment, biodiversity monitoring, and wildlife habitat improvement. Fieldwork is often time-consuming and labor-required, free satellite data are available in coarse resolution and the use of manned aircraft is relatively costly. Recently, unmanned aeria...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • LLC

دوره 21  شماره 

صفحات  -

تاریخ انتشار 2006